13. Numerical Features & Feature Column API

Transform/Preprocess Numerical Features with Feature Column API

TensorFlow Dataset API

The TensorFlow Dataset (tf.data) API helps you build a flexible and efficient input pipeline that delivers data to the training steps. The pipeline can aggregate and batch data from various sources. Put simply, it makes loading a dataset easy.

We introduce the TensorFlow Dataset API here because it is particularly helpful when the data is very large, stored in different formats across a distributed file system, and needs some transformation while it is being loaded.
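As a rough illustration of such a pipeline, the sketch below builds a tf.data dataset from a small in-memory dictionary, then shuffles and batches it. The field names ("age", "weight"), the values, and the labels are made up for this example; a real pipeline would more likely read from files.

```python
import tensorflow as tf

# Hypothetical records: two numerical fields and a binary label.
features = {
    "age": [30.0, 40.0, 50.0, 60.0],
    "weight": [60.0, 70.0, 80.0, 90.0],
}
labels = [0, 1, 0, 1]

# Slice the columns into per-row examples, then shuffle and batch them.
dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(buffer_size=4, seed=0)
    .batch(2)
)

# Materialize one pass over the data so the batches can be inspected.
batches = list(dataset)
for batch_features, batch_labels in batches:
    print(batch_features["age"].numpy(), batch_labels.numpy())
```

Each batch is a (features, labels) pair in which the features are still keyed by field name, which is the shape of input the Feature Column API expects.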

TensorFlow Feature Column API Key Points

The TensorFlow Feature Column API makes data preprocessing easier by abstracting away some of the work, such as normalizing numerical features. If you have done this type of work in scikit-learn or PySpark, you might appreciate what this API does for you when preparing features for modeling. It can also build less common features such as cross features and shared embeddings.

Additional Resources

Transform/Preprocess Numerical Features

Numerical Features and TensorFlow Feature Columns API

To use the TensorFlow Feature Columns with numerical features we need to do the following:

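  1. Identify the fields with numerical features.
  2. Use the TensorFlow Dataset API to load the dataset.
  3. Create your own custom normalizer function, such as a z-score. The mean and std parameters here are one possible signature, with both values computed from the training data:
import tensorflow as tf
def z_score_normalizer(col, mean, std):
    # z-score: subtract the mean, then divide by the standard deviation
    return (tf.cast(col, tf.float32) - mean) / std
  4. Use the TensorFlow numeric_column feature and pass the normalizer to the normalizer_fn argument. Because normalizer_fn is called with a single tensor, bind the statistics first, for example with functools.partial.
    • tf.feature_column.numeric_column(column_name, normalizer_fn=functools.partial(z_score_normalizer, mean=mean, std=std))
  5. Let the TensorFlow Feature Column API do its magic!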
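
The steps above can be sketched end to end as follows. This is a minimal example, not the course's reference solution: it assumes a TensorFlow 2.x release that still ships tf.feature_column, and the field name "age", the sample values, and the (col, mean, std) normalizer signature are illustrative choices. To keep the sketch small, the column's normalizer is called directly on a tensor; in a model, the column would normally be handed to a feature layer instead.

```python
import functools

import tensorflow as tf


def z_score_normalizer(col, mean, std):
    """Standardize a numeric tensor: (value - mean) / std."""
    return (tf.cast(col, tf.float32) - mean) / std


# Hypothetical training values for a numerical field named "age".
ages = [30.0, 40.0, 50.0, 60.0]

# Statistics are computed once, from the training split only.
mean = sum(ages) / len(ages)
std = (sum((a - mean) ** 2 for a in ages) / len(ages)) ** 0.5

# normalizer_fn receives a single tensor, so bind mean and std up front.
normalizer = functools.partial(z_score_normalizer, mean=mean, std=std)
age_column = tf.feature_column.numeric_column("age", normalizer_fn=normalizer)

# Apply the column's normalizer by hand to inspect the result.
normalized = age_column.normalizer_fn(tf.constant(ages))
print(normalized.numpy())
```

Computing the mean and standard deviation on the training split, and reusing them for validation and test data, keeps information from the held-out splits out of the preprocessing.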

Additional Resources

  • Normalize Features in TensorFlow

Code

If you need the code, it is available at https://github.com/udacity.

Numerical Features

Which of the following are true about numerical features in the TF Feature Column API?

SOLUTION:
  • You must identify the fields with numerical features.
  • You should use the TensorFlow Dataset API to convert the dataset to TensorFlow tensors for the TF Feature Column API.